set.seed(530476)
N = 500
w = rnorm(N)
plot.ts(w, main = "White Noise Process, n = 500")
acf(w, lag.max = 20, main = "White Noise Process (sample) ACF, n = 500")
The theoretical (population) ACF:
x = cbind(c(0:20),c(1,rep(0,20)))
plot(x[,1],x[,2], type = "h", main = "Population ACF ", xlab = "Lag", ylab = "ACF"); abline(h=0)
The sample ACF and the theoretical (population) ACF are very similar. The true ACF has no autocorrelation beyond lag 0 (a white noise process is independent across time), while the sample ACF shows some small autocorrelation at a few lags, none of it statistically significant.
set.seed(530476)
N = 50
w = rnorm(N)
plot.ts(w, main = "White Noise Process, n = 50")
acf(w, lag.max = 20, main = "White Noise Process (sample) ACF, n = 50")
The main difference is that the second time series (n = 50) is less cluttered. Comparing the sample and population ACF plots, when n drops from 500 to 50, the sample autocorrelations tend to be (slightly) larger in magnitude, though they remain statistically insignificant.
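This is consistent with how the ACF plot's significance bounds behave: acf() draws them at roughly ±1.96/√n, so they widen as n shrinks and individual sample autocorrelations scatter more. A quick check of the bounds for both sample sizes:

```r
# White-noise significance bounds drawn by acf(): approximately +/- 1.96/sqrt(n)
bound_500 = 1.96 / sqrt(500)
bound_50  = 1.96 / sqrt(50)
round(c(n500 = bound_500, n50 = bound_50), 3)
# About 0.088 for n = 500 versus 0.277 for n = 50, so noticeably
# larger sample autocorrelations can still be insignificant at n = 50.
```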
set.seed(530476)
N = 500
w = rnorm(N)
x = filter(w, filter=rep(1/3,3))
some_ts = x
plot(some_ts)
x_acf = acf(x, na.action = na.omit)
include an image/sketch of actual ACF
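In place of a sketch, the population ACF can be plotted directly. For a 3-point moving average of white noise with variance σ², the autocovariance is γ(h) = (3 − |h|)σ²/9 for |h| ≤ 2 and 0 otherwise, so ρ(h) = (3 − |h|)/3:

```r
# Population ACF of a 3-point moving average of white noise:
# rho(h) = (3 - |h|) / 3 for |h| <= 2, and 0 for |h| >= 3
h = 0:20
rho = ifelse(h <= 2, (3 - h) / 3, 0)
plot(h, rho, type = "h", main = "Population ACF, 3-point moving average",
     xlab = "Lag", ylab = "ACF"); abline(h = 0)
```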
When n = 50:
set.seed(530476)
N = 50
w = rnorm(N)
x = filter(w, filter=rep(1/3,3))
some_ts = x
plot(some_ts)
x_acf = acf(x, na.action = na.omit)
jj = JohnsonJohnson
t = time(jj)-1970 #center it
jjReg = lm(log(JohnsonJohnson) ~ 0 + t + as.factor(cycle(JohnsonJohnson)))
summary(jjReg)
##
## Call:
## lm(formula = log(JohnsonJohnson) ~ 0 + t + as.factor(cycle(JohnsonJohnson)))
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.29318 -0.09062 -0.01180 0.08460 0.27644
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## t 0.167172 0.002259 74.00 <2e-16 ***
## as.factor(cycle(JohnsonJohnson))1 1.052793 0.027359 38.48 <2e-16 ***
## as.factor(cycle(JohnsonJohnson))2 1.080916 0.027365 39.50 <2e-16 ***
## as.factor(cycle(JohnsonJohnson))3 1.151024 0.027383 42.03 <2e-16 ***
## as.factor(cycle(JohnsonJohnson))4 0.882266 0.027412 32.19 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.1254 on 79 degrees of freedom
## Multiple R-squared: 0.9935, Adjusted R-squared: 0.9931
## F-statistic: 2407 on 5 and 79 DF, p-value: < 2.2e-16
jjReg$coefficients["t"]
The estimated average annual increase in the logged earnings per share is the trend coefficient, approximately 0.167 per year. Since the response is logged, this corresponds to roughly an 18% increase in earnings per share each year.
jjReg$coefficients[4]-jjReg$coefficients[5]
## as.factor(cycle(JohnsonJohnson))3
## 0.2687577
Assuming the model is correct, average logged earnings would decrease by about 0.2688 going from the third quarter to the fourth. Converting from the log scale, this is a percentage decrease in earnings of about 23.6% (since 1 − e^(−0.2688) ≈ 0.236).
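As a quick check of that conversion from a log difference to a percent change, using the Q3 − Q4 coefficient difference reported above:

```r
# Convert a difference in log earnings into a percent change in earnings
log_diff = -0.2687577            # Q4 level minus Q3 level, from the fit above
pct_change = (exp(log_diff) - 1) * 100
round(pct_change, 1)             # about -23.6, i.e. a 23.6% decrease
```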
If you include an intercept in the model, the intercept becomes a baseline level (effectively the Quarter 1 effect) against which the remaining quarters are measured. We don't want this here because we want to be able to easily isolate the unique effect of each quarter on (logged) earnings per share. Including an intercept together with all four quarterly dummies would also produce perfect multicollinearity (the dummy variable trap).
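To see the dummy variable trap concretely: if the model is refit with an intercept, R resolves the collinearity by dropping one quarter level, which is absorbed into the intercept as the baseline. A sketch:

```r
# Refit the same model but with an intercept included
jj = JohnsonJohnson                  # quarterly earnings, built into R
trend = time(jj) - 1970
reg_int = lm(log(jj) ~ trend + factor(cycle(jj)))
coef(reg_int)  # only quarters 2-4 get coefficients; quarter 1 is the baseline
```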
library(aTSA)
##
## Attaching package: 'aTSA'
## The following object is masked from 'package:graphics':
##
## identify
Qtr = factor(cycle(jj))
trend = time(jj) -1970
reg = lm(log(jj) ~ 0 + trend + Qtr,na.action = NULL)
plot.ts(log(jj))
lines(fitted(reg),col = 3)
The model appears to fit the data well (perhaps almost too well, suggesting some overfitting). The residuals appear to be small, since the fitted values track the data so closely.
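A more direct way to judge the fit (a sketch continuing from the regression above) is to plot the residuals themselves and their ACF:

```r
# Inspect the residuals of the fit above
res = ts(resid(reg), start = start(jj), frequency = frequency(jj))
plot.ts(res, main = "Residuals of the quarterly trend model")
acf(res)  # leftover autocorrelation would suggest structure the model missed
```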
library(astsa)
##
## Attaching package: 'astsa'
## The following object is masked _by_ '.GlobalEnv':
##
## jj
#variance of the first half of the varve (glacier) time series:
var(varve[1:(length(varve)/2)]) #note the parentheses: 1:(n/2), not (1:n)/2
## [1] 132.501
#variance of the second half
var(varve[(length(varve)/2):length(varve)])
## [1] 592.9645
logTransformedYt = log(varve)
both = cbind(varve, logTransformedYt) #avoid naming a variable "ts" (masks stats::ts)
plot.ts(both)
var(logTransformedYt[1:(length(logTransformedYt)/2)])
## [1] 0.269403
var(logTransformedYt[(length(logTransformedYt)/2):length(logTransformedYt)])
## [1] 0.4506843
The glacier time series, Xt, appears to display heteroskedasticity, since the variance is non-constant throughout the entire series (significantly larger in the second half of the series than in the first).
However, upon performing a log-transformation on the time series, the variance appears to stabilize.
#non-transformed
hist(varve)
#log transformed
hist(logTransformedYt)
The transformation also appears to improve the normality assumption: whereas the raw time series is heavily right-skewed, the log-transformed series looks much more normal.
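A sketch of a more direct normality check, comparing normal Q-Q plots before and after the transformation (varve comes from the astsa package loaded above):

```r
# Normal Q-Q plots: points close to the reference line suggest normality
par(mfrow = c(1, 2))
qqnorm(varve, main = "Raw varve")
qqline(varve)
qqnorm(log(varve), main = "log(varve)")
qqline(log(varve))
par(mfrow = c(1, 1))
```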
plot.ts(logTransformedYt)
There does not appear to be any comparable behavior over the given time intervals between the two time series.
acf(logTransformedYt)
The log-transformed series (Yt) appears to be highly persistent: there is statistically significant autocorrelation between each observation and its lags out to 25 and beyond.
Ut = diff(logTransformedYt) #first difference u_t = y_t - y_{t-1}; for ts objects lag() shifts the time base, so diff() is the safe choice
plot.ts(Ut)
acf(Ut)
Both the time series plot and the ACF plot appear to support the notion that the differenced series is stationary. The series does not appear to have any time trend: it is centered around 0 and stays that way for the duration of the series. Its variance also does not appear to depend on time, as it looks roughly constant throughout.
The ACF plot supports this. Aside from the autocorrelation at lag 1, the autocorrelations are practically 0 for the remainder of the series, which suggests the series' ac(v)f does not depend on time, another indicator that this time series is indeed stationary.